Interacting RNA polymerase motors on DNA track: 
effects of traffic congestion and intrinsic noise on RNA synthesis

RNA polymerase (RNAP) is an enzyme that synthesizes a messenger RNA (mRNA) strand which
is complementary to a single-stranded DNA template. From the perspective of physicists, an RNAP 
is a molecular motor that utilizes chemical energy input to move along the track formed by a 
DNA. In many circumstances, which are described in this paper, a large number of RNAPs move 
simultaneously along the same track; we refer to such collective movements of the RNAPs as RNAP 
traffic. Here we develop a theoretical model for RNAP traffic by incorporating the steric interactions 
between RNAPs as well as the mechano-chemical cycle of individual RNAPs during the elongation 
of the mRNA. By a combination of analytical and numerical techniques, we calculate the rates 
of mRNA synthesis and the average density profile of the RNAPs on the DNA track. We also 
introduce, and compute, two new measures of fluctuations in the synthesis of RNA. Analyzing 
these fluctuations, we show how the level of intrinsic noise in mRNA synthesis depends on the 
concentrations of the RNAPs as well as on those of some of the reactants and the products of 
the enzymatic reactions catalyzed by RNAP. We suggest appropriate experimental systems and 
techniques for testing our theoretical predictions. 

PACS numbers: 87.16.Ac, 89.20.-a

I. INTRODUCTION

Molecular motors [1, 2, 3] are either proteins or macromolecular complexes that utilize
some form of input energy (often chemical energy) to perform mechanical work. In many
circumstances, molecular motors move collectively on a single track in a manner that
strongly resembles vehicular traffic [4, 5]. In recent years some minimal models of
molecular motor traffic have been developed to study their generic features [6, 7, 8, 9].
More detailed models for specific motor traffic systems have also been proposed by
capturing the stochastic mechano-chemistry of individual motors as well as their steric
interactions within the same model to investigate the interplay of individual and
collective dynamics of the motors [10, 11, 12]. In this paper we develop such a model for
a specific class of motors for which no attempt has been made in the past to capture
their steric interactions during traffic-like collective movements on a single track.

According to the central dogma of molecular biology, the genetic message stored in the
DNA is first transcribed into messenger RNA (mRNA) from which it is then translated into
proteins. Polymerization of a mRNA from the corresponding single-stranded DNA (ssDNA)
template is carried out by a motor called RNA polymerase (RNAP) [13, 14, 15]. In
contrast, synthesis of a protein from the corresponding mRNA template is mediated by
another motor, called ribosome, which translocates along the mRNA strand. The steric
interactions between the neighbouring ribosomes, which simultaneously translocate along
the same mRNA, were taken into account in most of the theoretical models of translation
developed since the late sixties [10, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26].
Surprisingly, in spite of the close similarities between the template-dictated and
motor-driven polymerization of macromolecules in transcription and translation, no
attempt has been made in the past to incorporate interactions of RNAPs in the theoretical
description of transcription. Instead, to our knowledge, all the models of transcription
reported so far [27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] capture only the
stochastic mechano-chemistry of the individual RNAP motors. Cooperation and collisions
between RNAP motors are known to have non-trivial effects on the rate of transcription
[39, 40, 41, 42]. Moreover, the possibility of the formation of queues in RNAP traffic
has also been explored [43]. In fact, if the gene is relatively short, a sufficiently
long queue of RNAPs on the ssDNA template can reduce the accessibility of the promoter
sequence, thereby lowering the rate of further initiation of transcription.

The main aim of this paper is to develop a model of RNAP traffic that incorporates steric
interactions between RNAP motors which move along the same DNA track. In this model, we
incorporate the most essential features of the multi-step mechano-chemical pathway of the
individual RNAP motors by a scheme which was used earlier in Wang et al.'s [28] model for
a single RNAP. The steric interaction between the RNAPs is assumed to be hard-core
repulsion. The effects of these interactions of RNAPs are captured in our model of mRNA
synthesis in the same manner in which the steric interactions of ribosomes were captured
in a recent model [10] of protein synthesis.

In the spirit of traffic science [44], we define the flux
to be the average number of motors crossing a site per
unit time. Thus, flux is expressed in the units "number
per second". We define the number density to be the
average number of RNAPs attached to unit length of the 
DNA template. Using the terminology of traffic science, 
we refer to the relation between the flux and the number 
density of the RNAPs as the fundamental diagram. We 
calculate the flux and investigate its dependence on the 
number density of RNAP on the DNA template as well 
as on some other experimentally accessible parameters 
of the model. Since average speed of a RNAP is also a 
measure of the average rate of mRNA elongation and the 
flux gives the total rate of mRNA synthesis from a DNA 
template, our calculations predict the effects of RNAP 
traffic congestion on the rate of synthesis of mRNA. 

The steps of the mechano-chemical cycle of a RNAP are intrinsically stochastic and give
rise to fluctuations in the rates of synthesis of mRNA. We introduce quantitative
measures of these fluctuations by drawing analogy with some further concepts from traffic
science [44]. We define the run time $T$ of a RNAP to be the actual time it takes to
travel from the start site to the stop site on the DNA template (i.e., the time taken to
synthesize an mRNA transcript). Similarly, we define the time-headway $\tau$ to be the
time interval between the departures of two successive RNAPs from the stop site on the
DNA template (i.e., the time interval between the completion of the synthesis of
successive mRNA transcripts). Using the stochastic model which we develop here for RNAP
traffic, we also compute the distributions $P_T$ and $\mathcal{P}_\tau$ of run times and
time-headways, respectively.
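
As an illustration of these two definitions (not part of the original analysis), the following minimal Python sketch extracts run times and time-headways from recorded event times. The function names and the input format (lists of entry and departure times in seconds) are hypothetical choices made only for this example; the standard deviation is used as one simple measure of the width of each distribution.

```python
from statistics import mean, pstdev


def run_times(entry_times, departure_times):
    """Run time of each RNAP: departure from the stop site minus entry at the start site.

    Both lists are assumed to be ordered by RNAP (same index = same RNAP).
    """
    return [out - into for into, out in zip(entry_times, departure_times)]


def time_headways(departure_times):
    """Intervals between the departures of successive RNAPs from the stop site."""
    ordered = sorted(departure_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def mean_and_width(samples):
    """Mean and (population) standard deviation of a list of samples."""
    return mean(samples), pstdev(samples)
```

Since RNAPs cannot overtake one another on the track, sorting the departure times does not change their order in a real trajectory; it only guards against unordered input.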

In recent years, stochasticity in gene expression has been probed by novel experimental
techniques and the results have inspired several theoretical models at different levels
of complexity [45]. The cell-to-cell variations in the levels of expression of the same
gene can arise from inherently intrinsic fluctuations in transcription and translation or
from extrinsic causes [46]. Since proteins are the final products of gene expression,
fluctuations in the concentration of proteins are normally taken as a measure of the
noise in gene expression. However, the most direct way to measure transcriptional noise
would be to monitor the fluctuations in the synthesis of mRNA transcripts
[47, 48, 49, 50]. Therefore, instead of modeling cell-to-cell variations in the
transcription of a specific gene, in this paper we study the RNAP-to-RNAP fluctuations in
the synthesis of mRNA from a single DNA template. The widths of the distributions $P_T$
and $\mathcal{P}_\tau$ provide measures of the contributions to transcriptional noise
from the intrinsic fluctuations in the steps of the mechano-chemical cycle of RNAPs on
the same DNA template.

The paper is organized as follows. In section II we summarize the essential
mechano-chemical processes involved in transcription. In the same section we also present
a brief review of some of the relevant earlier models. Our stochastic model is developed
in section III. Our theoretical predictions on flux and average density profiles, which
follow from this model under periodic and open boundary conditions, are discussed in
sections IV and V, respectively. Our results on fluctuations and transcriptional noise
are presented in section VI. The experimental implications of our theoretical predictions
are discussed in section VII. Finally, in section VIII we summarize our main theoretical
predictions.



II. BRIEF REVIEW OF PHENOMENOLOGY 
AND EARLIER MODELS 

A. Essential chemo-mechanical processes 

DNA and RNA are linear polymers whose monomeric 
subunits are called nucleotides. Transcription, i.e., the 
process of synthesis of mRNA from the corresponding ssDNA
template, can be broadly divided into three stages,
namely, initiation, elongation and termination. In the
initiation stage, an RNAP recognizes the so-called "pro- 
moter sequence" on the DNA and locally unzips the two 
DNA strands creating a "bubble" whereby a ssDNA tem- 
plate is exposed to it. However, in this paper we are 
interested mainly in the elongation of the mRNA tran- 
script. 

During elongation [51], each successful addition of a nucleotide to the elongating mRNA
leads to a forward stepping of the RNAP. The RNAP, together with the DNA bubble and the
growing RNA transcript, forms a "transcription elongation complex" (TEC). The essential
components of each of the TECs are shown explicitly in the schematic depiction of RNAP
traffic in fig. 1(a). As reported in the literature [28], the typical size of a
transcription bubble is about 15 nucleotides (i.e., about 5 nm) whereas a single RNAP
covers a DNA segment that can be as long as 35 nucleotides (i.e., about 12 nm). The
non-template DNA strand remains in single-stranded conformation in the bubble region
while an 8-10 nucleotide-long DNA-RNA hetero-duplex is formed by a part of the template
DNA strand and the growing end of the RNA (see fig. 1(a)).

Each mechano-chemical cycle of the RNAP during the elongation stage consists of several
steps; the major steps being (i) nucleoside triphosphate (NTP) binding to the active site
of the RNAP when the active site is located at the growing tip of the mRNA transcript,
(ii) NTP hydrolysis, (iii) release of pyrophosphate (PPi), one of the products of
hydrolysis, and (iv) accompanying forward stepping of the RNAP. This simplified scenario,
which is adequate for our purpose here, is shown symbolically in equation (1):

$$
\mathrm{TEC}_n + \mathrm{NTP} \rightleftharpoons \mathrm{TEC}_n \cdot \mathrm{NTP} \rightleftharpoons \mathrm{TEC}_{n+1} \cdot \mathrm{PP}_i \rightleftharpoons \mathrm{TEC}_{n+1} \tag{1}
$$



The elongation process ends when the TEC encoun- 
ters the corresponding "termination sequence" and the 
nascent mRNA is released by the RNAP. 



B. Brief review of the earlier models 

A stochastic chemical kinetic model was developed by Jülicher and Bruinsma [27] to
describe not only the polymerization of mRNA by a RNAP, but also to account for the
effects of elastic strain in the motor. Almost simultaneously, Wang et al. [28] developed
a model that incorporated the multi-step chemical kinetics of the transcription
elongation process. Extending Von Hippel's pioneering works on sequence-dependent
thermodynamic analysis of transcription, Wang and collaborators [31, 32] have developed a
sequence-dependent kinetic model in terms of a transcription-energy landscape. This model
has been extended further by Tadigotla et al. [35] by incorporating the kinetic barriers
erected by the folding of the mRNA transcript. Very recently, Bai et al. [33] have
demonstrated the predictive power of their theoretical model by carrying out experiment
and data analysis in two stages: in the first stage they estimated the model parameters
from experimental exploration of the response to chemical perturbations and, then, in the
second stage they used these parameters to predict the responses to mechanical
perturbations. But, as stated in the introduction, none of these models incorporates
steric interactions between the RNAPs.

To our knowledge, the first model of molecular motor traffic was developed almost forty
years ago by MacDonald, Gibbs and collaborators [16, 17] in the context of ribosome
traffic. In the pioneering works [16, 17], as well as in most of the extensions in recent
years [18, 19, 20, 21, 22, 23, 24, 25, 26], the details of molecular composition and
architecture as well as the mechano-chemical cycles of the ribosomes were not taken into
account. Instead, each ribosome was modelled as a hard rod; in the special limit where
the size of the rod coincides with the lattice constant, this model reduces to the
totally asymmetric simple exclusion process (TASEP), which is the simplest model of
interacting self-propelled particles. Very recently, a more realistic model [10] of
ribosome traffic has been developed by incorporating the essential steps in the
mechano-chemical cycle of a ribosome during the elongation of the protein. Traffic of
some other families of motors has also been modelled recently in the same spirit, i.e.,
by incorporating both the intra-motor mechano-chemistry and inter-motor steric
interactions [11].



III. MODEL








FIG. 1: (a) A schematic representation of RNAP traffic where 
the three dashed squares represent three TECs. The solid 
lines connecting filled circles represent the two strands of the 
double-stranded DNA while the string of open circles denotes 
the elongating RNA molecule. The dashed lines connecting 
the circles denote the unbroken non-covalent bonds between 
the complementary subunits on the DNA and RNA strands. 
Each of the grey ovals represents the catalytic site on the 
corresponding RNAP. (b) A simplified version of fig. 1(a).
The DNA track for the RNAP motors is assumed to be, effectively,
a one-dimensional lattice. Each TEC has been
replaced by a rectangular black box that can cover r lattice
sites simultaneously (r = 6 in this figure). The RNAP in each
TEC can exist in either of the two chemical states.

For the purpose of quantitative modeling, we simplify the schematic picture of RNAP
traffic shown in fig. 1(a). We represent the DNA track for RNAP motors by a
one-dimensional lattice and each TEC by a rectangular box (see fig. 1(b)). Although the
actual size of a TEC may be slightly larger than that of the associated RNAP, from now
onwards, in this paper we shall ignore this size difference. In other words, we assume
that the size of the black box in fig. 1(b) is identical to that of a TEC as well as that
of a RNAP motor. We label the sites of the lattice by the integer index i (by convention,
from left to right). The sites i = 1 and i = L represent the start and stop sites,
respectively. Each of the remaining sites in between the start and stop sites (i.e.,
2 ≤ i ≤ L − 1) represents a single nucleotide on the DNA template. The size of a single
RNAP is such that each motor can simultaneously cover r successive nucleotides on the DNA
template (typically, r is 30 to 35 base pairs, but in fig. 1 r = 6). According to our
convention, the position of each RNAP is denoted by the integer index of the lattice site
covered by the leftmost site of the RNAP. Thus, the allowed range of the positions j of
each RNAP is 1 ≤ j ≤ L. The hard-core steric interactions among the RNAPs are captured by
imposing the condition that no lattice site is allowed to be covered simultaneously by
more than one RNAP. Irrespective of the actual numerical value of r, each RNAP can move
forward or backward by only one site in each time step, if demanded by its own
mechano-chemistry, provided the target site is not already covered by any other RNAP.
This is motivated by the fact that a RNAP must transcribe the successive nucleotides one
by one.
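
The exclusion rule described above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: RNAP positions are stored as a sorted list of leftmost sites on an open lattice with sites 1 to L, and the function name `can_step` and its arguments are hypothetical, not taken from the original work.

```python
def can_step(positions, k, direction, r, L):
    """Check whether the k-th RNAP may move by one site without overlapping a neighbour.

    positions : sorted list of leftmost sites (1-based) of all RNAPs on the track
    k         : index of the RNAP that attempts the step
    direction : +1 for a forward step, -1 for a backward step
    r         : number of lattice sites covered by one RNAP
    L         : index of the stop site (allowed positions are 1..L)
    """
    target = positions[k] + direction
    if target < 1 or target > L:
        return False  # the step would leave the allowed range of positions
    if direction == +1:
        # The RNAP would cover target..target+r-1; the next RNAP must start beyond that.
        return k + 1 >= len(positions) or positions[k + 1] - target >= r
    # Backward step: the previous RNAP covers r sites starting at its own position.
    return k == 0 or target - positions[k - 1] >= r
```

For example, with r = 6 an RNAP at site 1 cannot step forward if the next RNAP sits at site 7, because the two boxes are already in contact.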

The total number of RNAPs on the DNA template is denoted by the symbol N. Under periodic
boundary conditions (PBC), N is independent of time whereas N is a fluctuating
time-dependent quantity if open boundary conditions (OBC) are imposed on the system. The
number density of the RNAPs is $\rho = N/L$. The coverage density is defined by
$\rho_{cov} = Nr/L = \rho r$, which is the total fraction of the nucleotides covered by
all the RNAPs together. Under OBC, the number density as well as the coverage density
are, in general, fluctuating quantities, but the averages of these densities attain
time-independent values in the stationary state.

FIG. 2: A schematic representation of the mechano-chemical
cycle of each RNAP in our model in the elongation stage.
No PPi is bound to the RNAP in the state 1 whereas the
PPi-bound state of the RNAP is labelled by the index 2.

Our model is aimed at the elongation stage and is not intended to describe the initiation
and termination processes in detail. Therefore, we represent initiation and termination
by the two parameters $\alpha$ and $\beta$, respectively. Whenever the site i = 1 on the
DNA template is vacant, this site is allowed to be occupied by a new RNAP with the
probability $\alpha$ in the time interval $\Delta t$ (in all our numerical calculations
we take $\Delta t$ = 0.001 s). Similarly, a RNAP bound to the site i = L is allowed to
detach from the template with the probability $\beta$ in the time interval $\Delta t$.
For convenience, we also define the corresponding probabilities per unit time
$\omega_\alpha$ and $\omega_\beta$ for attachment and detachment, respectively. Note that
$\omega_\alpha$ is related to $\alpha$ by the relation
$\alpha = 1 - \exp(-\omega_\alpha \Delta t)$; $\omega_\beta$ is related to $\beta$ by a
similar relation.
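
The relation between a rate and the corresponding probability per time step can be sketched in Python as follows. This is a minimal illustration; the function names are hypothetical, and only the exponential relation itself and the value 0.001 s of the time step are taken from the text.

```python
import math


def rate_to_probability(rate, dt=0.001):
    """Probability of at least one event in a time interval dt: 1 - exp(-rate * dt)."""
    # expm1 keeps precision when rate * dt is very small.
    return -math.expm1(-rate * dt)


def probability_to_rate(probability, dt=0.001):
    """Inverse relation: rate = -ln(1 - probability) / dt, valid for 0 <= probability < 1."""
    return -math.log1p(-probability) / dt
```

For small values of the product of rate and time step the probability is close to that product, which is why a short time step makes the two descriptions practically interchangeable.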

Following Wang et al. [28], we adopt a simplified description of the chemical (or,
conformational) states of each individual RNAP. Since release of PPi is the rate-limiting
step in the process of elongation of the mRNA transcript, we consider only two,
effectively distinct, chemical states of the RNAP in each mechano-chemical cycle during
the elongation stage. In the state labelled by the integer index 1 no PPi is bound to the
RNAP whereas the PPi-bound state of the RNAP is labelled by the index 2. The simplified
scheme, which captures the essential mechano-chemical processes during the mRNA
transcript elongation, is shown in fig. 2. In this figure, $\omega^f_{21}$,
$\omega^f_{11}$ and $\omega^f_{22}$ are the rates of polymerization of RNA in three
different situations, namely, by the hydrolysis of nucleotides (i) on the RNAP, (ii) in
solution, while no PPi is bound to the RNAP, and (iii) in solution, while PPi is bound to
the RNAP. The corresponding rates of reverse transitions, which result in
depolymerization of the RNA, are denoted by the symbols $\omega^b_{12}$, $\omega^b_{11}$
and $\omega^b_{22}$, respectively. Finally, $\omega_{21}$ and $\omega_{12}$ are the rates
of association and dissociation, respectively, of PPi.

"Backtracking" and "hypertracking" of RNAP have been observed in in-vitro single-RNAP
experiments. Effects of backtracking on transcription have been investigated recently by
Voliotis et al. [38]. However, the model used by Voliotis et al. [38] does not explicitly
capture the biochemical transitions of a RNAP during its enzymatic cycle. Interestingly,
it has been experimentally demonstrated that backtracking of a RNAP gets strongly
suppressed if there is another RNAP close behind it. Therefore, we do not allow the
possibility of backtracking in our model as, except at extremely low densities of RNAPs,
backtrackings and hypertrackings are expected to be rare in RNAP traffic.

Four different types of nucleotides are used by nature to synthesize all DNA molecules.
Sequence inhomogeneity can lead to site-dependent rates of translocation of RNAP on its
track. In the context of TASEP, which is a special limit of our model of RNAP traffic,
effects of quenched random site-dependent hopping rates have been investigated
extensively over the last decade. Moreover, Brownian motors with quenched disorder have
also been studied. In the same spirit, single molecular motors, which move on DNA or RNA
tracks, have been modelled assuming the nucleotide sequence on the track to be random.

However, to our knowledge, for the realistic inhomogeneous, but correlated, sequences no
analytical technique is available at present for the calculation of the quantities of our
interest in this paper. In fact, the theoretical schemes developed so far for single
RNAPs [35], which take into account the actual sequence of the specific DNA track, are
implemented numerically. Even in the context of earlier models of protein synthesis,
almost all the theoretical results on the effects of sequence inhomogeneities have been
obtained by computer simulations [10, 26]. Therefore, for the sake of ease of analytical
calculations, throughout this paper we have ignored the sequence inhomogeneity of the
nucleotides on the DNA template and, instead, assumed a hypothetical homogeneous
sequence.

Let $P_\mu(i,t)$ denote the probability that there is a RNAP at the spatial position i
and in the chemical state $\mu$ at time t; $\mu = 1$ refers to the state in which the
RNAP is not bound to any PPi whereas $\mu = 2$ corresponds to the state with bound PPi.
Note that $P(i) = \sum_{\mu=1}^{2} P_\mu(i)$ is the probability of finding a RNAP at the
site i, irrespective of its chemical state. Similarly, $P_\mu = \sum_{i=1}^{L} P_\mu(i)$
is the probability of finding a RNAP in the chemical state $\mu$ irrespective of its
spatial position. We describe the stochastic dynamics of the system in terms of master
equations for $P_\mu(i,t)$. Most of our analytical results have been derived using the
mean-field approximation.

In order to test the range of validity of our approximate analytical calculations, we
have also carried out extensive computer simulations (Monte Carlo simulations) of our
model. In these simulations, we have used random sequential updating, which appropriately
corresponds to the master equations used in our analytical formalisms. In each run of the
simulations, the system was allowed to reach steady state in the first one million time
steps and the data for the steady state were collected over the next eight million time
steps. The entire process was repeated with a large number of different initial
conditions and, finally, the average steady-state flux was computed. We have observed
that the qualitative features of our results do not depend significantly on the actual
numerical value of r as long as r is sufficiently larger than unity. Therefore, unless
stated otherwise, all the numerical results plotted in this paper have been obtained
taking r = 10. In our test simulation runs, we did not find any significant variations in
the data for L ≥ 1000. Therefore, almost all the simulation data reported here were
generated in our production runs by keeping L fixed at L = 1000.
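
A minimal Python sketch of a random-sequential Monte Carlo update for this two-state model on a ring (periodic boundary conditions) is given below for illustration. It is not the simulation code used for the results of this paper. The rate values in `RATES` are hypothetical placeholders rather than the constants of ref. [28], the key names (`f11`, `b12`, ...) are shorthand invented here for the forward, backward and PPi binding/unbinding rates, and each transition is attempted with probability rate × time step, which assumes a small time step.

```python
import random

# Hypothetical placeholder rates in 1/s (NOT the values of ref. [28]).
RATES = {"f11": 1.0, "f21": 100.0, "f22": 0.2,
         "b11": 0.5, "b12": 0.1, "b22": 0.05,
         "w21": 1.0, "w12": 30.0}


def transitions(state, w):
    """List of (rate, displacement, new_state) for every move out of `state`."""
    if state == 1:  # no PPi bound
        return [(w["f11"], +1, 1), (w["f21"], +1, 2),
                (w["b11"], -1, 1), (w["w21"], 0, 2)]
    return [(w["f22"], +1, 2), (w["b12"], -1, 1),
            (w["b22"], -1, 2), (w["w12"], 0, 1)]


def simulate_pbc(L, N, r, w=RATES, dt=0.001, steps=2000, seed=0):
    """Return (flux, positions, states) for N RNAPs of size r on a ring of L sites."""
    if N < 1 or N * r > L:
        raise ValueError("need N >= 1 and N * r <= L")
    rng = random.Random(seed)
    spacing = L // N                       # at least r because N * r <= L
    pos = [k * spacing for k in range(N)]  # leftmost covered site of each RNAP
    state = [1] * N
    net_hops = 0
    for _ in range(steps):
        for _ in range(N):                 # one time step = N random-sequential updates
            k = rng.randrange(N)
            u, acc = rng.random(), 0.0
            for rate, move, new_state in transitions(state[k], w):
                acc += rate * dt           # probability of this transition in dt
                if u < acc:
                    if move == +1:         # empty sites between this RNAP and the next
                        free = (pos[(k + 1) % N] - pos[k] - 1) % L + 1 - r
                    elif move == -1:       # empty sites between the previous RNAP and this
                        free = (pos[k] - pos[(k - 1) % N] - 1) % L + 1 - r
                    else:
                        free = 1           # purely chemical transitions are never blocked
                    if free >= 1:
                        pos[k] = (pos[k] + move) % L
                        state[k] = new_state
                        net_hops += move
                    break
    flux = net_hops / (L * steps * dt)     # net crossings per site per second
    return flux, pos, state
```

A call such as `simulate_pbc(L=1000, N=50, r=10)` returns the measured flux together with the final configuration; in practice the initial transient would be discarded before averaging, as described above.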



IV. RNAP TRAFFIC UNDER PERIODIC 
BOUNDARY CONDITIONS 

We always denote the spatial position of a RNAP on 
the DNA track by the integer index of the site covered 
by the left edge of the RNAP (i.e., the leftmost of the 
r successive sites representing the RNAP). Thus, in our 
terminology, a site is occupied by a RNAP if it coincides 
with the leftmost of the r sites representing that RNAP 
while the next r − 1 sites on its right are said to be covered
by the same RNAP. 

Let $P(\underline{i}|j)$ be the conditional probability that, given a RNAP at site i,
there is another RNAP at site j; the underlined index i within the bracket denotes the
site whose occupational status is given. Obviously, $Q(\underline{i}|j)$ is the
conditional probability that, given a RNAP at site i, site j is empty; the meaning of the
underlined index i within the bracket is the same as in the case of P. Note that, if site
i is given to be occupied by one RNAP, the site i − 1 can be covered by another RNAP if,
and only if, the site i − r is also occupied.



A. Mean-field theory under periodic boundary 
conditions 

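In the mean-field approximation, the master equations for $P_\mu(i,t)$ are given by

$$
\begin{aligned}
\frac{dP_1(i,t)}{dt} ={}& \omega^b_{11} P_1(i+1,t)\, Q(i+1-r|\underline{i+1}) + \omega^f_{11} P_1(i-1,t)\, Q(\underline{i-1}|i-1+r) + \omega^b_{12} P_2(i+1,t)\, Q(i+1-r|\underline{i+1}) \\
&+ \omega_{12} P_2(i,t) - \omega_{21} P_1(i,t) - \omega^f_{21} P_1(i,t)\, Q(\underline{i}|i+r) - \omega^f_{11} P_1(i,t)\, Q(\underline{i}|i+r) - \omega^b_{11} P_1(i,t)\, Q(i-r|\underline{i})
\end{aligned} \tag{2}
$$

$$
\begin{aligned}
\frac{dP_2(i,t)}{dt} ={}& \omega^b_{22} P_2(i+1,t)\, Q(i+1-r|\underline{i+1}) + \omega^f_{22} P_2(i-1,t)\, Q(\underline{i-1}|i-1+r) + \omega^f_{21} P_1(i-1,t)\, Q(\underline{i-1}|i-1+r) \\
&+ \omega_{21} P_1(i,t) - \omega_{12} P_2(i,t) - \omega^f_{22} P_2(i,t)\, Q(\underline{i}|i+r) - \omega^b_{12} P_2(i,t)\, Q(i-r|\underline{i}) - \omega^b_{22} P_2(i,t)\, Q(i-r|\underline{i})
\end{aligned} \tag{3}
$$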

Note that the two equations (2) and (3) are not independent of each other because of the
condition

$$
P(i) = P_1(i) + P_2(i) = \frac{N}{L} = \rho. \tag{4}
$$

For our numerical calculations, we choose the same set of rate constants which Wang et
al. [28] extracted from empirical data; these are as follows:

… 9.4 s$^{-1}$ (5)



where we have used the abbreviation NMP for nucleoside monophosphate.

Compared to Basu and Chowdhury's model [10] of ribosome-driven protein synthesis, our
model of RNAP-driven mRNA synthesis involves fewer chemical states and, hence, fewer
master equations. In fact, the number of chemical states in this model is equal to that
in an earlier model of traffic of single-headed kinesin motors [11, 12] where the two
chemical states, however, have totally different physical interpretations. But the number
of terms involved in each of the master equations (2) and (3) is much larger than in
ref. [10] and in refs. [11, 12].



B. Steady state properties under periodic 
boundary conditions 

In the steady state all $P_\mu(i,t)$ become independent of time. Moreover, because of the
PBC, these probabilities are also independent of the site index i in the steady state of
the system. Therefore, from Bayes's theorem,

$$
P(\underline{i}|i+r) = \frac{P(i|\underline{i+r})\, P(i+r)}{P(i)} = P(i|\underline{i+r}) \tag{6}
$$

and, hence,

$$
Q(\underline{i}|i+r) = Q(i|\underline{i+r}). \tag{7}
$$



We calculate $Q(\underline{i}|i+r)$ along the same line as sketched in ref. [10]. Given
that the site i is occupied, the conditional probability that the site i + r is also
occupied is given by

$$
P(\underline{i}|i+r) = \frac{N-1}{L+N-Nr-1}. \tag{8}
$$

Thus, in the limit $L \to \infty$ and $N \to \infty$, while keeping $\rho = N/L$ fixed,
we get

$$
Q(\underline{i}|i+r) = Q(i|\underline{i+r}) = \frac{1-\rho r}{1+\rho-\rho r}. \tag{9}
$$
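
As a quick numerical illustration (not part of the original derivation), the finite-size expression (8) and its thermodynamic limit (9) can be compared in a few lines of Python. The function names are hypothetical.

```python
def p_neighbour_finite(L, N, r):
    """Eq. (8): probability that site i + r is occupied, given an RNAP at site i."""
    return (N - 1) / (L + N - N * r - 1)


def q_gap(rho, r):
    """Eq. (9): probability that site i + r is empty, given an RNAP at site i."""
    return (1.0 - rho * r) / (1.0 + rho - rho * r)


if __name__ == "__main__":
    r = 10
    for L in (10**3, 10**4, 10**5):
        N = L // 20  # number density rho = 0.05
        print(L, 1.0 - p_neighbour_finite(L, N, r), q_gap(N / L, r))
```

The printed pairs approach each other as L grows at fixed number density, and `q_gap` returns zero when the track is fully covered.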



Note that Q vanishes at $\rho = 1/r$, because the entire stretch of the DNA template
between the points of initiation and termination of transcription is fully covered by the
RNAPs at $\rho r = \rho_{cov} = 1$.

FIG. 3: The steady-state flux of the RNAPs, under periodic
boundary conditions, plotted as a function of the coverage
density $\rho_{cov}$ for (a) three different values of [NTP] at [PPi] =
1 μM, and (b) three different values of [PPi] at [NTP] = 1 mM.
The lines correspond to our mean-field theoretic predictions
whereas the discrete data points have been obtained from
computer simulations.



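Solving Eqs. (2), together with (4), in the steady state under PBC, we get

$$
P_1 = \frac{\omega_{12} + \omega^b_{12} Q}{\Omega_{\updownarrow}}\, \rho, \qquad
P_2 = \frac{\omega_{21} + \omega^f_{21} Q}{\Omega_{\updownarrow}}\, \rho, \tag{10}
$$

where

$$
\Omega_{\updownarrow} = \omega_{12} + \omega^b_{12} Q + \omega_{21} + \omega^f_{21} Q \tag{11}
$$

and Q is given by the equation (9).

In the steady state under PBC, the flux of the RNAPs is given by

$$
J = (\omega^f_{11} + \omega^f_{21})\, P_1\, Q(\underline{i}|i+r) + \omega^f_{22}\, P_2\, Q(\underline{i}|i+r) - \omega^b_{11}\, P_1\, Q(i-r|\underline{i}) - (\omega^b_{12} + \omega^b_{22})\, P_2\, Q(i-r|\underline{i}). \tag{12}
$$

Hence,

$$
J = \Omega_1 P_1 Q + \Omega_2 P_2 Q = (\Omega_1 P_1 + \Omega_2 P_2) \left( \frac{1-\rho_{cov}}{1+\rho-\rho_{cov}} \right) \tag{13}
$$

where

$$
\Omega_1 = \omega^f_{11} + \omega^f_{21} - \omega^b_{11}, \tag{14}
$$

$$
\Omega_2 = \omega^f_{22} - \omega^b_{12} - \omega^b_{22}, \tag{15}
$$

are two effective forward hopping rates from the states 1 and 2, respectively, while Q is
given by the equation (9). Since $P_1 = 0 = P_2$ at $\rho = 0$, the corresponding flux J
vanishes. J also vanishes at $\rho_{cov} = 1$ as Q vanishes at $\rho r = 1$.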
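
The mean-field flux under PBC can be evaluated numerically with a short Python sketch such as the one below. It is an illustration under explicit assumptions: the occupation probabilities of the two chemical states are obtained by balancing the transitions between states 1 and 2 in the steady state of the master equations together with condition (4); the rate values are hypothetical placeholders, not the constants of ref. [28]; and the dictionary keys are shorthand names invented for this example.

```python
# Hypothetical placeholder rates in 1/s (NOT the values of ref. [28]).
RATES = {"f11": 1.0, "f21": 100.0, "f22": 0.2,
         "b11": 0.5, "b12": 0.1, "b22": 0.05,
         "w21": 1.0, "w12": 30.0}


def mean_field_flux(rho, r, w=RATES):
    """Mean-field steady-state flux J under periodic boundary conditions."""
    rho_cov = rho * r
    q = (1.0 - rho_cov) / (1.0 + rho - rho_cov)   # eq. (9)
    to_state_2 = w["w21"] + w["f21"] * q          # total rate of 1 -> 2 transitions
    to_state_1 = w["w12"] + w["b12"] * q          # total rate of 2 -> 1 transitions
    p1 = rho * to_state_1 / (to_state_1 + to_state_2)
    p2 = rho * to_state_2 / (to_state_1 + to_state_2)
    omega_1 = w["f11"] + w["f21"] - w["b11"]      # eq. (14)
    omega_2 = w["f22"] - w["b12"] - w["b22"]      # eq. (15)
    return (omega_1 * p1 + omega_2 * p2) * q      # eq. (13)


if __name__ == "__main__":
    r = 10
    for k in range(0, 11):
        rho = k / (10 * r)                        # coverage density from 0 to 1
        print(f"rho_cov = {rho * r:.1f}   J = {mean_field_flux(rho, r):.4f}")
```

With any choice of positive rates the flux computed this way vanishes at zero density and at full coverage, which are the two limits noted above.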

Our mean-field estimate (13) of flux J is plotted against the coverage density
$\rho_{cov}$ in fig. 3 for (a) three different values of [NTP] at [PPi] = 1 μM, and
(b) three different values of [PPi] at [NTP] = 1 mM. The qualitative features of these
fundamental diagrams are similar to those observed earlier [10] in the context of
ribosomal traffic during protein synthesis from a mRNA template. The most notable feature
of these diagrams is their asymmetric shape. This shape of the fundamental diagram is in
sharp contrast to the symmetry of the fundamental diagram of TASEP about $\rho = 1/2$.
The physical reason for the asymmetric shape of the fundamental diagram in fig. 3 is the
same as in ribosomal traffic [10].

The rate constant $\omega^f_{21}$ is higher at higher concentrations of NTP and gives
rise to higher flux, i.e., higher rate of transcriptional output (see fig. 3).
Conversely, higher concentration of PPi opposes the release of PPi, thereby slowing down
the overall rate of transcription. Moreover, at higher concentrations of NTP each RNAP
attempts forward stepping more frequently; while making these attempts, it feels stronger
hindrance at higher densities of RNAPs. Therefore, the deviation of the mean-field
estimates of flux from the corresponding simulation data is larger at higher NTP
concentration and at higher coverage density of the RNAPs. Similarly, forward stepping of
a RNAP is less suppressed when the PPi concentration in the solution is lower; therefore,
stronger deviation of the mean-field estimates of flux from the corresponding simulation
data is observed at lower PPi concentration and higher RNAP densities.



V. RESULTS UNDER OPEN BOUNDARY 
CONDITIONS 

Open boundary conditions are more realistic than PBC for describing RNAP traffic during
transcription. A fresh RNAP can attach to the site i = 1 only in the state 1 (i.e., no
PPi is bound to it). In this section we make a further assumption for simplifying the
equations. We replace the conditional probability $Q(\underline{i}|j)$ by the probability
$Q(j)$ that site j is empty, irrespective of the state of occupation of any other site.
Note that the probability of finding a "hole" at j (i.e., the probability that the site j
is not "covered" by any RNAP) is given by $1 - \sum_{s=0}^{r-1} P(j-s)$, because a RNAP
occupying any one of the sites j − r + 1, ..., j covers the site j.



A. Mean-field theory under open boundary 
conditions 

Under mean-field approximation, the master equations 
for the probabilities are now given by 



dPi(l,t) 



= uj a f p ( s )j + <"ii PiM+ "12 P2M+ «ia P 2 (l,i) 

^ S — l ' 

- wai J 2l ) Pi(l,t)(- l -^--i Pa + s) 



dPi(i,t) 
dt 



io b n P 1 (i + l,t)+ u\ 2 P 2 (i + l,t) 



EI=i^(i + s) + ^(i + 



1 - YZ=i P(i + l-s)+P(i+l- r) 

: 1 V. 12 P 2 (M) 



Pl{l ht) {l- £1=1 P(i -l + s)+P(i-l + r) 

l-Y,UiP{i + s 



u 2 i Pi(i,t)-( Ull + u 21 ) Pi (i, t) p( . + s) + p{ , + r) 



wii Pi(i,t) 



1 - e; =1 p(* + p{* - r) 



(16) 



(17) 
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dPi(L,t) 
dt 



- ,.,/ 



lo[ x Pi(L-l,i) 



U21 Pi CM) - LJ° n Pi(L,t) 



1 - Es=i ^ - 1 + «) + P(L - 1+ r) 



±-E r s=iP(L- S ) + P(L- 



W12 ft (A*) 
- W/3 P X (M) 



dP 2 (i,t) 
dt 



to^ Pi(i-l,t)+ J 22 P 2 (i-l,t) 



i-EI=i^-i + ^) 



1 - ££ =1 P(i - 1 + a) + P(t - 1 + r) 



+ --^ + M)^ ELiP( . + 1 _ s) + p(i + 1 _ r 

+ c^ 21 Pj {i, t) - u> 12 P 2 {i, t) - J 22 P 2 {i, t) ' ~ — 1 ; ' ' ^ " ' 



-Y, r s= iP(i + s)+P(i + r) 
( lo\ 2 + ) P 2 (i, t) ( *~ 1 " "I r 



(18) 



(19) 



dP 2 (L,t) 
dt 



4, P 1 (L-l,t)+ J 22 P 2 (L-l,t) 



+ lj 21 Pi(M) - wia P 2 (M) - ( wJa + c^ 2 ) P 2 (M) 
- tup P 2 {L,t) 



1 - Y, r s= iP(L ~ 1 + s) + P(L -1+r) 



l-El=i^-s) + P(i-r) 



(20) 
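As a minimal illustration of how mean-field equations of this type can be relaxed numerically to their steady state, the sketch below treats only the TASEP limit of the model (a single internal state, r = 1, forward hopping only) and not the full set of Eqs. (16)-(20). The function name `mean_field_steady_state`, the simple Euler scheme, the time step and all parameter values are illustrative assumptions and are not taken from the paper.

```python
def mean_field_steady_state(L=50, alpha=0.2, beta=1.0, q=1.0,
                            dt=0.1, n_steps=10000):
    """Euler-integrate mean-field TASEP equations with open boundaries.

    Assumptions (not the full model of the text): one internal state,
    particle size r = 1, forward hops only. P[i] is the occupation
    probability of site i (0-based indexing).
    """
    P = [0.0] * L                      # start from an empty track
    for _ in range(n_steps):
        # Mean-field current across each bond i -> i+1.
        hop = [q * P[i] * (1.0 - P[i + 1]) for i in range(L - 1)]
        inflow = [alpha * (1.0 - P[0])] + hop   # entry feeds the first site
        outflow = hop + [beta * P[-1]]          # exit drains the last site
        P = [P[i] + dt * (inflow[i] - outflow[i]) for i in range(L)]
    flux = beta * P[-1]                # steady-state flux measured at the exit
    return P, flux


P, flux = mean_field_steady_state()
print(f"flux = {flux:.4f}, mean density = {sum(P) / len(P):.4f}")
```

For these illustrative values the flux settles close to alpha * (1 - alpha) = 0.16, the low-density mean-field value of the TASEP; the same relaxation strategy could in principle be applied to the two-state equations once all the rate constants are specified.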



B. Steady state properties under open boundary 
conditions 



Using these mean-field equations (16)-(20) in the steady
state, we have numerically calculated our theoretical
estimates of the flux. These mean-field theoretic estimates
are plotted as functions of the rate constants $\omega_{\alpha}$ and $\omega^{f}_{21}$,
respectively, in figs. 4(a) and (b). In order to test the
accuracy of these approximate theoretical predictions,
we have compared them with the corresponding
numerical data obtained from our direct computer
simulations of the model under open boundary conditions.
We have also computed the average density profiles and
plotted them for three different values of $\omega_{\alpha}$ and
three different values of $\omega^{f}_{21}$ in the insets of figs. 4(a) and
(b), respectively.
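A direct simulation of the kind mentioned above can be sketched as follows. This is a hedged, simplified stand-in: it implements only the TASEP limit (one internal state, r = 1) with random-sequential updating, not the two-state model with extended RNAPs used in the paper. The function name `simulate_tasep`, the update scheme, the seed and the parameter values are all assumptions made for illustration.

```python
import random


def simulate_tasep(L=50, alpha=0.2, beta=1.0, q=1.0,
                   warmup=1000, sweeps=4000, seed=1):
    """Random-sequential Monte Carlo for an open-boundary TASEP.

    Simplifying assumptions: single internal state and particle size 1.
    Returns the flux (exits per sweep) and the time-averaged density profile.
    """
    rng = random.Random(seed)
    occ = [0] * L
    density = [0.0] * L
    exits = 0
    for sweep in range(warmup + sweeps):
        measuring = sweep >= warmup
        for _ in range(L + 1):            # one sweep = L + 1 attempted moves
            k = rng.randrange(L + 1)
            if k == 0:                    # entry attempt at the first site
                if occ[0] == 0 and rng.random() < alpha:
                    occ[0] = 1
            elif k == L:                  # exit attempt from the last site
                if occ[L - 1] == 1 and rng.random() < beta:
                    occ[L - 1] = 0
                    if measuring:
                        exits += 1
            else:                         # hop attempt from site k-1 to k
                if occ[k - 1] == 1 and occ[k] == 0 and rng.random() < q:
                    occ[k - 1], occ[k] = 0, 1
        if measuring:
            for i in range(L):
                density[i] += occ[i]
    flux = exits / sweeps
    density = [d / sweeps for d in density]
    return flux, density


flux, density = simulate_tasep()
print(f"flux = {flux:.3f}, bulk density = {density[len(density) // 2]:.3f}")
```

With these illustrative defaults the measured flux should lie close to alpha * (1 - alpha) = 0.16, up to statistical scatter, which is the kind of comparison between mean-field lines and simulation points that fig. 4 reports for the full model.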



The flux increases monotonically with increasing $\omega_{\alpha}$ as
well as with increasing $\omega^{f}_{21}$ and, eventually, saturates in
both the cases. This trend of variation of flux is
accompanied by a monotonic rise of the average density profile
of the RNAPs in fig. 4(a) and by a monotonic fall of
the average density profile in fig. 4(b). A comparison of
these qualitative features of the variation of flux and
density profiles with those in ribosome traffic [10] indicates
a transition from the low-density phase to the maximal
current phase in fig. 4(a) and from the high-density phase
to the maximal current phase in fig. 4(b) [10, 74].



VI. RNAP-TO-RNAP FLUCTUATIONS AND 
TRANSCRIPTIONAL NOISE 

The distribution $P_T$ of run times T is a measure of the
RNAP-to-RNAP fluctuations in the rates of transcription.
This distribution, obtained from computer simulations
of our model, is plotted in fig. 5(a) for two different
values of the parameter $\omega^{f}_{21}$. The Gaussian fit to the
distribution $P_T$ is consistent with the Gaussian distributions
of the "delay times" obtained by Morelli and Jülicher [75]
in the limit of a sufficiently large number of intermediate
steps. Gaussian distributions of the speeds of the RNAPs
were observed by Tolić-Nørrelykke et al. [76] in their
in-vitro experiments. Although this conclusion in ref. [76]
was based on the assumption of uniform speed of the
RNAPs during the elongation stage, what was actually
observed in their experiments is the Gaussian distribution
of the run times; this is certainly consistent with our
theoretical result.

FIG. 4: The steady-state flux of the RNAPs, under open
boundary conditions, plotted as a function of (a) $\omega_{\alpha}$, for three
sets of values of the pair of parameters [NTP], [PPi]; (b) $\omega^{f}_{21}$,
for three values of the parameter $\omega_{\alpha}$. The lines correspond to
our mean-field theoretic predictions whereas the discrete data
points have been obtained from computer simulations. The
insets show the average density profiles for (a) three different
values of $\omega_{\alpha}$ and (b) three different values of $\omega^{f}_{21}$.

We define the standard deviation, i.e., the root-mean-square
deviation

$$
\eta_T = \langle (T - \langle T \rangle)^2 \rangle^{1/2} \tag{21}
$$

of run times T from their mean, as a measure of the
transcriptional noise arising from the stochastic
mechano-chemical cycles of the RNAPs. $\eta_T$ is plotted as a function
of $\omega^{f}_{21}$ in the inset of fig. 5(a). Since $\omega^{f}_{21} = \omega^{f0}_{21}$[NTP],
the inset of fig. 5(a) clearly establishes that the
transcriptional noise $\eta_T$ falls exponentially with increasing
concentration of NTP. This trend of variation is consistent with
the well known fact that the fluctuations in the rates of
chemical synthesis are stronger when the concentrations
of reactants are lower.
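The root-mean-square deviation of Eq. (21) can be evaluated directly from a list of measured or simulated run times. The following is a minimal sketch; the function name `run_time_noise` and the sample values are hypothetical and serve only to illustrate the definition.

```python
import math


def run_time_noise(run_times):
    """Return (<T>, eta_T), with eta_T = <(T - <T>)^2>^(1/2) as in Eq. (21)."""
    n = len(run_times)
    mean = sum(run_times) / n
    eta = math.sqrt(sum((t - mean) ** 2 for t in run_times) / n)
    return mean, eta


# Made-up run times (in seconds), purely for illustration.
sample = [10.2, 9.8, 10.5, 9.9, 10.1]
mean_T, eta_T = run_time_noise(sample)
print(f"<T> = {mean_T:.3f} s, eta_T = {eta_T:.3f} s")
```

Note that this uses the population form (division by n), which matches the ensemble average in the definition; for small experimental samples one might prefer the unbiased estimator with n - 1 instead.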

FIG. 5: The distributions of the run times and time-headways
in our model, under open boundary conditions, plotted
in (a) and (b), respectively, for two different values of $\omega^{f}_{21}$.
The discrete data points have been obtained from computer
simulations; the curves fitted to these data points are drawn
as lines. The variation of the standard deviations of
the distributions of run times and time-headways with the
increase of the parameter $\omega^{f}_{21}$ is shown in the insets; the
discrete points have been obtained from the simulation data
and the best-fit curve through these points is drawn
as a line.

The distribution $\mathcal{P}_{\tau}$ of time-headways $\tau$ is a measure
of the fluctuations in the time interval between the
completion of the polymerization of successive mRNA
transcripts. This distribution, also obtained from computer
simulations of our model, is plotted in fig. 5(b) for the
same values of $\omega^{f}_{21}$ as those used in fig. 5(a). The best
fit to the numerical data for $\mathcal{P}_{\tau}$ is of the general form

$$
\mathcal{P}_{\tau} = C\,\tau^{\nu} e^{-\mu\tau} \tag{22}
$$

with positive constants $\mu$ and $\nu$, C being the
normalization constant. The form (22) is consistent with
the gamma distribution that is expected for the
time-headways at sufficiently low densities.
We define

$$
\eta_{\tau} = \langle (\tau - \langle \tau \rangle)^2 \rangle^{1/2} \tag{23}
$$

as a measure of the fluctuations in the time-headways.
In the inset of fig. 5(b) we plot $\eta_{\tau}$ as a function of $\omega^{f}_{21}$;
the best fit to this curve is an exponential.
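If the fitted time-headway distribution is taken to have the gamma form C τ^ν exp(-μτ) on τ ≥ 0 (an assumption about the form of Eq. (22)), the normalization constant and the standard deviation follow in closed form: C = μ^(ν+1)/Γ(ν+1), mean (ν+1)/μ and standard deviation sqrt(ν+1)/μ. The sketch below evaluates these and checks the normalization numerically; the function names and the values of μ and ν are hypothetical and are not fitted values from the paper.

```python
import math


def gamma_form_stats(mu, nu):
    """Normalization, mean and standard deviation of C * tau**nu * exp(-mu*tau)."""
    C = mu ** (nu + 1) / math.gamma(nu + 1)
    mean = (nu + 1) / mu
    eta_tau = math.sqrt(nu + 1) / mu
    return C, mean, eta_tau


def headway_density(tau, mu, nu):
    """Probability density of a time-headway tau for the assumed gamma form."""
    C, _, _ = gamma_form_stats(mu, nu)
    return C * tau ** nu * math.exp(-mu * tau)


mu, nu = 2.0, 3.0                      # illustrative values only
C, mean, eta_tau = gamma_form_stats(mu, nu)

# Crude Riemann-sum check that the density integrates to one.
d = 0.001
total = sum(headway_density(k * d, mu, nu) * d for k in range(1, 40000))
print(f"C = {C:.4f}, mean = {mean:.3f}, eta_tau = {eta_tau:.3f}, integral = {total:.4f}")
```

In a fit to simulation data, μ and ν would be the free parameters, and the closed-form standard deviation offers a quick consistency check against the directly measured value of Eq. (23).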

In the limit in which $\omega_{12} \to \infty$ and all other rate
constants, except $\omega^{f}_{21} = q$, vanish, our model reduces to
TASEP if, simultaneously, $r \to 1$. In this limit of our
model $\mathcal{P}_{\tau}$ is expected to be well approximated by the
exact expression for TASEP with parallel updating [73, 78]:

$$
\begin{aligned}
\mathcal{P}_{t} ={}& \frac{qy}{\rho - y}\left\{1 - \frac{qy}{\rho}\right\}^{t-1}
+ \frac{qy}{(1-\rho) - y}\left\{1 - \frac{qy}{1-\rho}\right\}^{t-1} \\
& - \left[\frac{qy}{\rho - y} + \frac{qy}{(1-\rho) - y}\right](1-q)^{t-1}
- q^{2}(t-1)(1-q)^{t-2}
\end{aligned}
\tag{24}
$$

where

$$
y = \frac{1}{2q}\left[1 - \sqrt{1 - 4q\rho(1-\rho)}\right]. \tag{25}
$$

Finally, the transcriptional noise increases, instead of
decreasing, with the increase of PPi concentration (see
fig. 6); in other words, an increase of PPi concentration not
only slows down the average rate of RNA synthesis, but
also makes transcription more noisy.

FIG. 6: The distributions of the run times and time-headways
in our model, under open boundary conditions, plotted in (a)
and (b), respectively, for two different values of $\omega_{21}$. The
discrete data points have been obtained from computer
simulations; the curves fitted to these data points are drawn
as lines. The variation of the standard deviation of the
distributions of run times with the increase of the parameter
$\omega_{21}$ is shown in the insets; the discrete points have been
obtained from the simulation data and the best-fit curve through
these points is drawn as a line.
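The TASEP time-headway expression of Eqs. (24)-(25) can be checked numerically. The sketch below assumes the standard parallel-update form, with hopping probability q, density ρ and integer headway t ≥ 1; the function name `tasep_headway_distribution` and the parameter values are illustrative assumptions. It verifies that the distribution sums to one and that its mean equals the inverse of the flux q·y.

```python
import math


def tasep_headway_distribution(t, q, rho):
    """Time-headway probability for TASEP with parallel updating.

    t   : integer headway (number of time steps), t >= 1
    q   : hopping probability per time step (0 < q < 1 assumed here)
    rho : number density of particles
    """
    y = (1.0 - math.sqrt(1.0 - 4.0 * q * rho * (1.0 - rho))) / (2.0 * q)
    a = q * y / (rho - y)
    b = q * y / ((1.0 - rho) - y)
    p = 1.0 - q
    term1 = a * (1.0 - q * y / rho) ** (t - 1)
    term2 = b * (1.0 - q * y / (1.0 - rho)) ** (t - 1)
    term3 = (a + b) * p ** (t - 1)
    term4 = q * q * (t - 1) * p ** (t - 2) if t >= 2 else 0.0
    return term1 + term2 - term3 - term4


q, rho = 0.5, 0.3                      # illustrative values only
y = (1.0 - math.sqrt(1.0 - 4.0 * q * rho * (1.0 - rho))) / (2.0 * q)
probs = [tasep_headway_distribution(t, q, rho) for t in range(1, 501)]
total = sum(probs)
mean_headway = sum(t * pt for t, pt in zip(range(1, 501), probs))
print(f"sum = {total:.6f}, mean headway = {mean_headway:.4f}, 1/flux = {1.0 / (q * y):.4f}")
```

The agreement between the mean headway and 1/(q·y) reflects the general relation between the mean time-headway and the flux, which also underlies the use of time-headways as a measure of fluctuations in mRNA synthesis.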



VII. IMPLICATIONS FOR EXPERIMENTS 

Almost all the quantitative theories of RNAP devel- 
oped so far m m, in m mi, m m, m, m m m were 

intended to account for the mechano-chemistry of a single
RNAP. The interaction of RNAPs in transcriptional
interference [79] is a well known phenomenon and it has
also been modelled quantitatively [42]. However, instead
of studying interactions of RNAPs during the transcription
of different genes [39, 40, 41, 42], we have modelled
the steric interactions of RNAPs which are simultaneously
involved in the transcription of the same gene.

The possibility of steric interactions of RNAPs during
their traffic-like collective movements along the same
DNA template has been known for a long time [80, 81].
The "Christmas-tree"-like structures [82, 83] observed in
electron microscopic studies of eukaryotic transcription
arise from simultaneous transcription of the same gene by
many RNAPs. These structures also have strong similarity
with the dense population of the nascent mRNA transcripts
observed all along the loops of the DNA strands
in the electron micrographs of lampbrush chromosomes.

Our theory predicts not only the average rate of synthesis
of RNA, but also two different measures of fluctuations
in the process of transcription. In most of the earlier
experimental investigations of transcriptional noise, the
distributions of the sizes and frequencies of the "bursts"
of transcriptional activity were recorded. However, the
size and frequency of the bursts depend on the temporal
resolution used for sorting the time series of the events
into separate bursts (see, for example, Fig. 1 of ref. [86]).
Therefore, in principle, the statistics of reported
distributions of burst sizes and frequencies may change with
the time resolution selected for such sorting.
Instead, in this paper, we have introduced new measures
of the stochasticity in transcriptional activity which do
not require any sorting of this kind.
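The dependence of burst statistics on the chosen time resolution can be illustrated with a toy example. In the sketch below the event times are made up, and the sorting rule (a gap longer than `max_gap` starts a new burst) is only one plausible convention; the function name `sort_into_bursts` is hypothetical and not taken from any of the cited experiments.

```python
def sort_into_bursts(event_times, max_gap):
    """Group sorted event times into bursts.

    Assumed convention: an event joins the current burst if it follows the
    previous event by at most max_gap; otherwise it starts a new burst.
    """
    bursts = []
    for t in event_times:
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


# Made-up completion times of successive transcripts (in seconds).
events = [0.0, 0.3, 0.9, 3.0, 3.2, 6.5, 6.8, 7.6, 12.0]

burst_sizes = {}
for gap in (0.5, 1.0, 4.0):
    burst_sizes[gap] = [len(b) for b in sort_into_bursts(events, gap)]
    print(f"max_gap = {gap}: burst sizes = {burst_sizes[gap]}")
```

The same series of events yields six, four or two bursts depending on the threshold, whereas the time-headways themselves (the successive differences of `events`) are independent of any such choice.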

In recent years sophisticated optical techniques have 
been developed for single mRNA imaging [U [H, HH, • 
We believe that our theoretical predictions can be tested
most appropriately by carrying out in-vitro experiments
with either fluorescently labelled RNAPs [76], or
techniques for tagging the nascent mRNA with fluorescent
probes [47, 87], or techniques where fluorescent
probes can quickly bind with the nascent mRNA as soon
as it is released by the RNAP [48]. Comparison of our
theoretical predictions on the distributions of run times
and time headways with experiments requires collection
of appropriate data. In our theory, the run time includes
the time spent by an RNAP in the elongation stage as well
as in the termination stage, but does not include the time
spent in the initiation stage. Therefore, in experiments,
run times of the RNAPs should be measured only from the
instant when the TEC gets stabilized; a technique used in
ref. [76] may be utilized for this purpose.



VIII. SUMMARY AND CONCLUSIONS 

Surprisingly, no attempt has been made in the past to
develop mathematical models for RNAP traffic where
transcription of a single gene is carried out simultaneously
by a stream of RNAPs closely spaced on the same
DNA template. To our knowledge, the model developed
in this paper is the first attempt to capture inter-RNAP
interactions in a model where the mechano-chemical
cycles of the individual RNAPs in the elongation stage
are also incorporated, albeit in a simplified manner. In
analogy with vehicular traffic [44], we have defined the
flux for RNAP traffic; the RNAP flux is also the total
rate of synthesis of RNA. We have calculated the average
rates of RNA synthesis analytically under mean-field
approximation.

Drawing analogies with vehicular traffic, we have 
defined two novel quantities whose distributions serve 
as measures of RNAP-to-RNAP fluctuations in the 
transcription of a single gene. We have calculated these 
distributions numerically by carrying out computer 
simulations of our model. The widths of these distri- 
butions (more precisely, root-mean-square fluctuations) 
can be treated as good measures of the strength of 
"transcriptional noise". We have investigated how the 
level of "transcriptional noise" depends on some of the 
model parameters which can be varied in a controlled 
manner in laboratory experiments. A similar analysis
of "translational noise", which arises from ribosome-
to-ribosome fluctuations during protein synthesis from
the same mRNA template, will be reported elsewhere
[91]. The inhomogeneous sequence of nucleotides on
the DNA template can lead to stronger fluctuations 
thereby making additional contributions to the levels of 
transcriptional noise. The "intrinsic noise" studied in 
this paper arises from the stochastic nature of the steps 
of the mechano-chemical cycle of individual RNAPs. 
Although the noise level gets affected by the interactions 
of the RNAPs, this noise remains relevant even when 
the gene is transcribed by one RNAP at a time. We 
have made concrete suggestions as to the experimental 
systems and techniques which, in principle, can be used 
to test our theoretical predictions. 
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